Results 1 - 20 of 88
1.
BMC Med Inform Decis Mak ; 22(Suppl 2): 348, 2024 Mar 03.
Article in English | MEDLINE | ID: mdl-38433189

ABSTRACT

BACKGROUND: Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, one of the major manifestations of SLE contributing to organ damage and mortality, is a key component of lupus classification criteria. Accurately identifying lupus nephritis in electronic health records (EHRs) would therefore benefit large cohort observational studies and clinical trials where characterization of the patient population is critical for recruitment, study design, and analysis. Lupus nephritis can be recognized through procedure codes and structured data, such as laboratory tests. However, other critical information documenting lupus nephritis, such as histologic reports from kidney biopsies and prior medical history narratives, requires sophisticated text processing to mine information from pathology reports and clinical notes. In this study, we developed algorithms to identify lupus nephritis with and without natural language processing (NLP) using EHR data from the Northwestern Medicine Enterprise Data Warehouse (NMEDW). METHODS: We developed five algorithms: a rule-based algorithm using only structured data (baseline algorithm) and four algorithms using different NLP models. The first NLP model applied a simple regular-expression keyword search combined with structured data. The other three NLP models were based on regularized logistic regression and used different sets of features including positive mention of concept unique identifiers (CUIs), number of appearances of CUIs, and a mixture of three components (i.e. a curated list of CUIs, regular expression concepts, structured data) respectively. The baseline algorithm and the best performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC).
RESULTS: Our best performing NLP model incorporated features from structured data, regular expression concepts, and mapped concept unique identifiers (CUIs) and showed an improved F-measure in both the NMEDW (0.41 vs 0.79) and VUMC (0.52 vs 0.93) datasets compared to the baseline lupus nephritis algorithm. CONCLUSION: Our NLP MetaMap mixed model improved the F-measure greatly compared to the structured data only algorithm in both internal and external validation datasets. The NLP algorithms can serve as powerful tools to accurately identify the lupus nephritis phenotype in EHRs for clinical research and better targeted therapies.
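
The first NLP model described above pairs a regular-expression keyword search with structured data. The following is a minimal sketch of that idea, not the study's implementation: the pattern, the function name `flag_patient`, and the choice to combine the two signals with a logical OR are all assumptions made for illustration, since the abstract does not give the actual expressions or combination rule.

```python
import re

# Hypothetical keyword pattern; the study's real expressions are not given
# in the abstract. Note that this sketch does not handle negation
# (for example "no evidence of lupus nephritis" would still match).
NEPHRITIS_PATTERN = re.compile(
    r"\blupus\s+nephritis\b|\bclass\s+(?:III|IV|V)\s+nephritis\b",
    re.IGNORECASE,
)


def flag_patient(notes: list[str], has_structured_evidence: bool) -> bool:
    """Flag a patient if any note matches a keyword OR structured data is positive.

    The OR combination is an assumption made for this sketch.
    """
    text_hit = any(NEPHRITIS_PATTERN.search(note) for note in notes)
    return text_hit or has_structured_evidence


if __name__ == "__main__":
    example_notes = ["Kidney biopsy consistent with Lupus nephritis."]
    print(flag_patient(example_notes, has_structured_evidence=False))
```

A production algorithm would need negation and context handling, which is one reason the other models in the study rely on mapped concepts rather than raw keywords.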


Subjects
Lupus Erythematosus, Systemic; Lupus Nephritis; Humans; Lupus Nephritis/diagnosis; Electronic Health Records; Natural Language Processing; Phenotype; Rare Diseases
2.
Lupus Sci Med ; 10(2)2023 10.
Article in English | MEDLINE | ID: mdl-37857531

ABSTRACT

OBJECTIVE: To assess the application and utility of algorithms designed to detect features of SLE in electronic health record (EHR) data in a multisite, urban data network. METHODS: Using the Chicago Area Patient-Centered Outcomes Research Network (CAPriCORN), a Clinical Data Research Network (CDRN) containing data from multiple healthcare sites, we identified patients with at least one positively identified criterion from three SLE classification criteria sets developed by the American College of Rheumatology (ACR) in 1997, the Systemic Lupus International Collaborating Clinics (SLICC) in 2012, and the European Alliance of Associations for Rheumatology and the ACR in 2019 using EHR-based algorithms. To measure the algorithms' performance in this data setting, we first evaluated whether the number of clinical encounters for SLE was associated with a greater quantity of positively identified criteria domains using Poisson regression. We next quantified the amount of SLE criteria identified at a single healthcare institution versus all sites to assess the amount of SLE-related information gained from implementing the algorithms in a CDRN. RESULTS: Patients with three or more SLE encounters were estimated to have documented 2.77 (2.73 to 2.80) times the number of positive SLE attributes from the 2012 SLICC criteria set than patients without an SLE encounter via Poisson regression. Patients with three or more SLE-related encounters and with documented care from multiple institutions were identified with more SLICC criteria domains when data were included from all CAPriCORN sites compared with a single site (p<0.05). CONCLUSIONS: The positive association observed between amount of SLE-related clinical encounters and the number of criteria domains detected suggests that the algorithms used in this study can be used to help describe SLE features in this data environment. 
This work also demonstrates the benefit of aggregating data across healthcare institutions for patients with fragmented care.


Subjects
Lupus Erythematosus, Systemic; Rheumatology; Humans; United States; Lupus Erythematosus, Systemic/diagnosis; Lupus Erythematosus, Systemic/epidemiology; Severity of Illness Index; Medical Records; Patient Outcome Assessment
3.
Nat Commun ; 14(1): 6030, 2023 09 27.
Article in English | MEDLINE | ID: mdl-37758692

ABSTRACT

Influenza A Virus (IAV) is a recurring respiratory virus with limited availability of antiviral therapies. Understanding host proteins essential for IAV infection can identify targets for alternative host-directed therapies (HDTs). Using affinity purification-mass spectrometry and global phosphoproteomic and protein abundance analyses using three IAV strains (pH1N1, H3N2, H5N1) in three human cell types (A549, NHBE, THP-1), we map 332 IAV-human protein-protein interactions and identify 13 IAV-modulated kinases. Whole exome sequencing of patients who experienced severe influenza reveals several genes, including scaffold protein AHNAK, with predicted loss-of-function variants that are also identified in our proteomic analyses. Of our identified host factors, 54 significantly alter IAV infection upon siRNA knockdown, and two factors, AHNAK and coatomer subunit COPB1, are also essential for productive infection by SARS-CoV-2. Finally, 16 compounds targeting our identified host factors suppress IAV replication, with two targeting CDK2 and FLT3 showing pan-antiviral activity across influenza and coronavirus families. This study provides a comprehensive network model of IAV infection in human cells, identifying functional host targets for pan-viral HDT.


Subjects
COVID-19; Influenza A Virus, H5N1 Subtype; Influenza A virus; Influenza, Human; Humans; Influenza A virus/genetics; Influenza, Human/genetics; Influenza A Virus, H5N1 Subtype/genetics; Influenza A Virus, H3N2 Subtype/metabolism; Proteomics; Virus Replication/genetics; SARS-CoV-2; Antiviral Agents/metabolism; Host-Pathogen Interactions/genetics
4.
Sci Rep ; 13(1): 8102, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208478

ABSTRACT

The objective of this study was to investigate the potential association between the use of four frequently prescribed drug classes, namely antihypertensive drugs, statins, selective serotonin reuptake inhibitors, and proton-pump inhibitors, and the likelihood of disease progression from mild cognitive impairment (MCI) to dementia using electronic health records (EHRs). We conducted a retrospective cohort study using observational EHRs from a cohort of approximately 2 million patients seen at a large, multi-specialty urban academic medical center in New York City, USA between 2008 and 2020 to automatically emulate randomized controlled trials. For each drug class, two exposure groups were identified based on the prescription orders documented in the EHRs following the MCI diagnosis. During follow-up, we measured drug efficacy based on the incidence of dementia and estimated the average treatment effect (ATE) of various drugs. To ensure the robustness of our findings, we confirmed the ATE estimates via bootstrapping and presented associated 95% confidence intervals (CIs). Our analysis identified 14,269 MCI patients, among whom 2501 (17.5%) progressed to dementia. Using average treatment estimation and bootstrapping confirmation, we observed that drugs including rosuvastatin (ATE = -0.0140 [-0.0191, -0.0088], p value < 0.001), citalopram (ATE = -0.1128 [-0.125, -0.1005], p value < 0.001), escitalopram (ATE = -0.0560 [-0.0615, -0.0506], p value < 0.001), and omeprazole (ATE = -0.0201 [-0.0299, -0.0103], p value < 0.001) showed a statistically significant association with slower progression from MCI to dementia. The findings from this study support a role for commonly prescribed drugs in altering the progression from MCI to dementia and warrant further investigation.
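
The abstract reports ATE estimates with bootstrap 95% confidence intervals. As a rough illustration of the bootstrap step only, the sketch below computes a percentile interval for an unadjusted difference in dementia incidence between two exposure groups. This is a simplification and an assumption: the abstract does not specify the estimator, and a real trial emulation would adjust for confounding. The function names and the outcome data are hypothetical.

```python
import random


def risk_difference(treated: list[int], control: list[int]) -> float:
    """Unadjusted difference in outcome incidence (treated minus control)."""
    return sum(treated) / len(treated) - sum(control) / len(control)


def bootstrap_ci(
    treated: list[int],
    control: list[int],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float]:
    """Percentile bootstrap confidence interval for the risk difference."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        estimates.append(risk_difference(t, c))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical outcomes (1 = progressed to dementia), not study data.
    treated = [1] * 15 + [0] * 85
    control = [1] * 25 + [0] * 75
    print(risk_difference(treated, control))
    print(bootstrap_ci(treated, control))
```

A negative value here means lower incidence in the exposed group, which matches the sign convention of the ATEs quoted in the abstract.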


Subjects
Alzheimer Disease; Cognitive Dysfunction; Humans; Alzheimer Disease/diagnosis; Retrospective Studies; Electronic Health Records; Disease Progression; Cognitive Dysfunction/drug therapy; Cognitive Dysfunction/epidemiology; Cognitive Dysfunction/diagnosis; Randomized Controlled Trials as Topic
5.
JAMIA Open ; 6(2): ooad032, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37181728

ABSTRACT

With the burgeoning development of computational phenotypes, it is increasingly difficult to identify the right phenotype for the right tasks. This study uses a mixed-methods approach to develop and evaluate a novel metadata framework for retrieving and reusing computational phenotypes. Twenty active phenotyping researchers from 2 large research networks, Electronic Medical Records and Genomics and Observational Health Data Sciences and Informatics, were recruited to suggest metadata elements. Once consensus was reached on 39 metadata elements, 47 new researchers were surveyed to evaluate the utility of the metadata framework. The survey consisted of 5-point Likert-scale multiple-choice questions and open-ended questions. Two more researchers were asked to use the metadata framework to annotate 8 type-2 diabetes mellitus phenotypes. More than 90% of the survey respondents rated metadata elements regarding phenotype definition and validation methods and metrics positively with a score of 4 or 5. Both researchers completed annotation of each phenotype within 60 min. Our thematic analysis of the narrative feedback indicates that the metadata framework was effective in capturing rich and explicit descriptions and enabling the search for phenotypes, compliance with data standards, and comprehensive validation metrics. Current limitations were its complexity for data collection and the entailed human costs.

6.
PLoS One ; 18(5): e0283553, 2023.
Article in English | MEDLINE | ID: mdl-37196047

ABSTRACT

OBJECTIVE: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique. MATERIALS AND METHODS: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs. We performed genome-wide association studies (GWAS) of DD in European, African and multi-ancestry participants, followed by phenome-wide association studies (PheWAS) of the risk variants to identify their potential comorbid/pleiotropic effects in clinical phenotypes. RESULTS: Our developed algorithm showed a significant improvement in patient classification performance for DD analysis (algorithm PPVs ≥ 0.94), with up to a 3.5-fold increase in the number of identified patients compared with the traditional method. Ancestry-stratified analyses of diverticulosis and diverticulitis of the identified subjects replicated the well-established associations between ARHGAP15 loci and DD, showing overall intensified GWAS signals in diverticulitis patients compared to diverticulosis patients. Our PheWAS analyses identified significant associations between the DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes. DISCUSSION: As the first multi-ancestry GWAS-PheWAS study, we showed that heterogeneous EHR data can be mapped through an integrative analytical pipeline and reveal significant genotype-phenotype associations with clinical interpretation.
CONCLUSION: A systematic framework to process unstructured EHR data with NLP could advance a deep and scalable phenotyping for better patient identification and facilitate etiological investigation of a disease with multilayered data.


Subjects
Diverticular Diseases; Diverticulitis; Diverticulum; Humans; Electronic Health Records; Genome-Wide Association Study/methods; Natural Language Processing; Phenotype; Algorithms; Polymorphism, Single Nucleotide
7.
Sci Rep ; 13(1): 1971, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36737471

ABSTRACT

The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms is essential to support local customizations.


Subjects
Electronic Health Records; Natural Language Processing; Genomics; Algorithms; Phenotype
8.
Sci Rep ; 13(1): 294, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36609415

ABSTRACT

Left ventricular ejection fraction (EF) is a key measure in the diagnosis and treatment of heart failure (HF) and many patients experience changes in EF over time. Large-scale analysis of longitudinal changes in EF using electronic health records (EHRs) is limited. In a multi-site retrospective study using EHR data from three academic medical centers, we investigated longitudinal changes in EF measurements in patients diagnosed with HF. We observed significant variations in baseline characteristics and longitudinal EF change behavior of the HF cohorts compared with a previous study based on HF registry data. Data gathered from this longitudinal study were used to develop multiple machine learning models to predict changes in ejection fraction measurements in HF patients. Across all three sites, we observed higher performance in predicting EF increase over a 1-year duration, with similarly higher performance predicting an EF increase of 30% from baseline compared to lower percentage increases. In predicting EF decrease we found moderate to high performance with low confidence for various models. Among various machine learning models, XGBoost was the best performing model for predicting EF changes. Across the three sites, the XGBoost model had an F1-score of 87.2, 89.9, and 88.6 and AUC of 0.83, 0.87, and 0.90 in predicting a 30% increase in EF, and had an F1-score of 95.0, 90.6, and 90.1 and AUC of 0.54, 0.56, and 0.68 in predicting a 30% decrease in EF. Among features that contribute to predicting EF changes, baseline ejection fraction measurement, age, gender, and heart diseases were found to be statistically significant.


Subjects
Heart Failure; Ventricular Function, Left; Humans; Electronic Health Records; Longitudinal Studies; Machine Learning; Prognosis; Retrospective Studies; Stroke Volume
9.
J Am Med Inform Assoc ; 30(3): 427-437, 2023 02 16.
Article in English | MEDLINE | ID: mdl-36474423

ABSTRACT

OBJECTIVE: The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used. MATERIALS AND METHODS: A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries. RESULTS: Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts. Most use 4 or fewer medical terminologies. The number of codes used ranges from 5 to 6865, and value sets from 1 to 19. We found that the most common expressions used were literal, data, and logical expressions. Aggregate and arithmetic expressions are the least common. Expression depth ranges from 4 to 27. DISCUSSION: Despite the range of conditions, we found that all of the phenotype definitions consisted of logical criteria, representing both clinical and operational logic, and tabular data, consisting of codes from standard terminologies and keywords for natural language processing. The total number and variety of expressions are low, which may be to simplify implementation, or authors may limit complexity due to data availability constraints. CONCLUSIONS: The phenotype definitions analyzed show significant variation in specific logical, arithmetic, and other operators but are all composed of the same high-level components, namely tabular data and logical expressions. A standard representation for phenotype definitions should support these formats and be modular to support localization and shared logic.
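
The abstract reports expression depth as one measure of phenotype logic complexity. As a rough illustration of what such a metric can look like, the sketch below computes the depth of a nested logical expression. The dictionary layout (`op`, `operands`) and the convention that a leaf counts as depth 1 are assumptions made here; they are not the CQL or ELM representation, and the study's exact counting rule is not stated in the abstract.

```python
def expression_depth(expr) -> int:
    """Return the nesting depth of an expression tree.

    Anything that is not a dict is treated as a leaf (a literal or a data
    reference) with depth 1. A dict node adds one level on top of its
    deepest operand.
    """
    if not isinstance(expr, dict):
        return 1
    operands = expr.get("operands", [])
    return 1 + max((expression_depth(o) for o in operands), default=0)


if __name__ == "__main__":
    # Hypothetical phenotype logic: a diagnosis exists AND there are at
    # least two encounters.
    example = {
        "op": "And",
        "operands": [
            {"op": "Exists", "operands": ["Diagnosis"]},
            {
                "op": "GreaterOrEqual",
                "operands": [{"op": "Count", "operands": ["Encounter"]}, 2],
            },
        ],
    }
    print(expression_depth(example))
```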


Subjects
Electronic Health Records; Language; Phenotype; Narration
10.
Nat Genet ; 54(8): 1103-1116, 2022 08.
Article in English | MEDLINE | ID: mdl-35835913

ABSTRACT

The chr12q24.13 locus encoding OAS1-OAS3 antiviral proteins has been associated with coronavirus disease 2019 (COVID-19) susceptibility. Here, we report genetic, functional and clinical insights into this locus in relation to COVID-19 severity. In our analysis of patients of European (n = 2,249) and African (n = 835) ancestries with hospitalized versus nonhospitalized COVID-19, the risk of hospitalized disease was associated with a common OAS1 haplotype, which was also associated with reduced severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) clearance in a clinical trial with pegIFN-λ1. Bioinformatic analyses and in vitro studies reveal the functional contribution of two associated OAS1 exonic variants comprising the risk haplotype. Derived human-specific alleles rs10774671-A and rs1131454-A decrease OAS1 protein abundance through allele-specific regulation of splicing and nonsense-mediated decay (NMD). We conclude that decreased OAS1 expression due to a common haplotype contributes to COVID-19 severity. Our results provide insight into molecular mechanisms through which early treatment with interferons could accelerate SARS-CoV-2 clearance and mitigate severe COVID-19.


Subjects
COVID-19; 2',5'-Oligoadenylate Synthetase/genetics; 2',5'-Oligoadenylate Synthetase/metabolism; Alleles; COVID-19/genetics; Hospitalization; Humans; SARS-CoV-2/genetics
11.
J Am Med Inform Assoc ; 29(9): 1449-1460, 2022 08 16.
Article in English | MEDLINE | ID: mdl-35799370

ABSTRACT

OBJECTIVES: To develop and validate a standards-based phenotyping tool to author electronic health record (EHR)-based phenotype definitions and demonstrate execution of the definitions against heterogeneous clinical research data platforms. MATERIALS AND METHODS: We developed an open-source, standards-compliant phenotyping tool known as the PhEMA Workbench that enables a phenotype representation using the Fast Healthcare Interoperability Resources (FHIR) and Clinical Quality Language (CQL) standards. We then demonstrated how this tool can be used to conduct EHR-based phenotyping, including phenotype authoring, execution, and validation. We validated the performance of the tool by executing a thrombotic event phenotype definition at 3 sites, Mayo Clinic (MC), Northwestern Medicine (NM), and Weill Cornell Medicine (WCM), and used manual review to determine precision and recall. RESULTS: An initial version of the PhEMA Workbench has been released, which supports phenotype authoring, execution, and publishing to a shared phenotype definition repository. The resulting thrombotic event phenotype definition consisted of 11 CQL statements, and 24 value sets containing a total of 834 codes. Technical validation showed satisfactory performance (both NM and MC had 100% precision and recall and WCM had a precision of 95% and a recall of 84%). CONCLUSIONS: We demonstrate that the PhEMA Workbench can facilitate EHR-driven phenotype definition, execution, and phenotype sharing in heterogeneous clinical research data environments. A phenotype definition that integrates with existing standards-compliant systems, and the use of a formal representation facilitates automation and can decrease potential for human error.
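
The validation above compares phenotype output against manual review to obtain precision and recall. A minimal sketch of that comparison follows, assuming the phenotype output and the chart-review gold standard are both available as sets of patient identifiers; the function name and the identifiers are hypothetical and do not come from the PhEMA Workbench.

```python
def precision_recall(predicted: set[str], confirmed: set[str]) -> tuple[float, float]:
    """Compare phenotype-positive patients with manually confirmed cases.

    precision = true positives / all patients the phenotype flagged
    recall    = true positives / all patients confirmed by review
    """
    true_positives = len(predicted & confirmed)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical patient identifiers, for illustration only.
    flagged_by_phenotype = {"p1", "p2", "p3", "p4"}
    confirmed_by_review = {"p1", "p2", "p3", "p5", "p6"}
    print(precision_recall(flagged_by_phenotype, confirmed_by_review))
```

Running the same function at each site on locally reviewed samples is one simple way to produce the kind of per-site figures the abstract reports.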


Subjects
Electronic Health Records; Polyhydroxyethyl Methacrylate; Humans; Language; Phenotype
12.
Genome Med ; 14(1): 70, 2022 06 29.
Article in English | MEDLINE | ID: mdl-35765100

ABSTRACT

BACKGROUND: Type 2 diabetes (T2D) is a worldwide scourge caused by both genetic and environmental risk factors that disproportionately afflicts communities of color. Leveraging existing large-scale genome-wide association studies (GWAS), polygenic risk scores (PRS) have shown promise to complement established clinical risk factors and intervention paradigms, and improve early diagnosis and prevention of T2D. However, to date, T2D PRS have been most widely developed and validated in individuals of European descent. Comprehensive assessment of T2D PRS in non-European populations is critical for equitable deployment of PRS to clinical practice that benefits global populations. METHODS: We integrated T2D GWAS in European, African, and East Asian populations to construct a trans-ancestry T2D PRS using a newly developed Bayesian polygenic modeling method, and assessed the prediction accuracy of the PRS in the multi-ethnic Electronic Medical Records and Genomics (eMERGE) study (11,945 cases; 57,694 controls), four Black cohorts (5137 cases; 9657 controls), and the Taiwan Biobank (4570 cases; 84,996 controls). We additionally evaluated a post hoc ancestry adjustment method that can express the polygenic risk on the same scale across ancestrally diverse individuals and facilitate the clinical implementation of the PRS in prospective cohorts. RESULTS: The trans-ancestry PRS was significantly associated with T2D status across the ancestral groups examined. The top 2% of the PRS distribution can identify individuals with an approximately 2.5- to 4.5-fold increase in T2D risk, which corresponds to the increased risk of T2D for first-degree relatives. The post hoc ancestry adjustment method eliminated major distributional differences in the PRS across ancestries without compromising its predictive performance.
CONCLUSIONS: By integrating T2D GWAS from multiple populations, we developed and validated a trans-ancestry PRS, and demonstrated its potential as a meaningful index of risk among diverse patients in clinical settings. Our efforts represent the first step towards the implementation of the T2D PRS into routine healthcare.


Subjects
Diabetes Mellitus, Type 2; Genome-Wide Association Study; Bayes Theorem; Diabetes Mellitus, Type 2/genetics; Genetic Predisposition to Disease; Humans; Prospective Studies; Risk Factors
13.
JAMA Oncol ; 8(6): 835-844, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35446370

ABSTRACT

Importance: Knowledge about the spectrum of diseases associated with hereditary cancer syndromes may improve disease diagnosis and management for patients and help to identify high-risk individuals. Objective: To identify phenotypes associated with hereditary cancer genes through a phenome-wide association study. Design, Setting, and Participants: This phenome-wide association study used health data from participants in 3 cohorts. The Electronic Medical Records and Genomics Sequencing (eMERGEseq) data set recruited predominantly healthy individuals from 10 US medical centers from July 16, 2016, through February 18, 2018, with a mean follow-up through electronic health records (EHRs) of 12.7 (7.4) years. The UK Biobank (UKB) cohort recruited participants from March 15, 2006, through August 1, 2010, with a mean (SD) follow-up of 12.4 (1.0) years. The Hereditary Cancer Registry (HCR) recruited patients undergoing clinical genetic testing at Vanderbilt University Medical Center from May 1, 2012, through December 31, 2019, with a mean (SD) follow-up through EHRs of 8.8 (6.5) years. Exposures: Germline variants in 23 hereditary cancer genes. Pathogenic and likely pathogenic variants for each gene were aggregated for association analyses. Main Outcomes and Measures: Phenotypes in the eMERGEseq and HCR cohorts were derived from the linked EHRs. Phenotypes in UKB were from multiple sources of health-related data. Results: A total of 214 020 participants were identified, including 23 544 in eMERGEseq cohort (mean [SD] age, 47.8 [23.7] years; 12 611 women [53.6%]), 187 234 in the UKB cohort (mean [SD] age, 56.7 [8.1] years; 104 055 [55.6%] women), and 3242 in the HCR cohort (mean [SD] age, 52.5 [15.5] years; 2851 [87.9%] women). All 38 established gene-cancer associations were replicated, and 19 new associations were identified. 
These included the following 7 associations with neoplasms: CHEK2 with leukemia (odds ratio [OR], 3.81 [95% CI, 2.64-5.48]) and plasma cell neoplasms (OR, 3.12 [95% CI, 1.84-5.28]), ATM with gastric cancer (OR, 4.27 [95% CI, 2.35-7.44]) and pancreatic cancer (OR, 4.44 [95% CI, 2.66-7.40]), MUTYH (biallelic) with kidney cancer (OR, 32.28 [95% CI, 6.40-162.73]), MSH6 with bladder cancer (OR, 5.63 [95% CI, 2.75-11.49]), and APC with benign liver/intrahepatic bile duct tumors (OR, 52.01 [95% CI, 14.29-189.29]). The remaining 12 associations with nonneoplastic diseases included BRCA1/2 with ovarian cysts (OR, 3.15 [95% CI, 2.22-4.46] and 3.12 [95% CI, 2.36-4.12], respectively), MEN1 with acute pancreatitis (OR, 33.45 [95% CI, 9.25-121.02]), APC with gastritis and duodenitis (OR, 4.66 [95% CI, 2.61-8.33]), and PTEN with chronic gastritis (OR, 15.68 [95% CI, 6.01-40.92]). Conclusions and Relevance: The findings of this genetic association study analyzing the EHRs of 3 large cohorts suggest that these new phenotypes associated with hereditary cancer genes may facilitate early detection and better management of cancers. This study highlights the potential benefits of using EHR data in genomic medicine.


Subjects
Gastritis; Neoplastic Syndromes, Hereditary; Pancreatitis; Acute Disease; Female; Genetic Predisposition to Disease; Germ-Line Mutation; Humans; Male
14.
BMC Med Inform Decis Mak ; 22(1): 23, 2022 01 28.
Article in English | MEDLINE | ID: mdl-35090449

ABSTRACT

INTRODUCTION: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. METHODS: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. RESULTS: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. DISCUSSION AND CONCLUSION: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.


Subjects
Algorithms; Electronic Health Records; Genomics; Humans; Knowledge Bases; Phenotype
15.
Circulation ; 145(12): 877-891, 2022 03 22.
Article in English | MEDLINE | ID: mdl-34930020

ABSTRACT

BACKGROUND: Sequencing Mendelian arrhythmia genes in individuals without an indication for arrhythmia genetic testing can identify carriers of pathogenic or likely pathogenic (P/LP) variants. However, the extent to which these variants are associated with clinically meaningful phenotypes before or after return of variant results is unclear. In addition, the majority of discovered variants are currently classified as variants of uncertain significance, limiting clinical actionability. METHODS: The eMERGE-III study (Electronic Medical Records and Genomics Phase III) is a multicenter prospective cohort that included 21 846 participants without previous indication for cardiac genetic testing. Participants were sequenced for 109 Mendelian disease genes, including 10 linked to arrhythmia syndromes. Variant carriers were assessed with electronic health record-derived phenotypes and follow-up clinical examination. Selected variants of uncertain significance (n=50) were characterized in vitro with automated electrophysiology experiments in HEK293 cells. RESULTS: As previously reported, 3.0% of participants had P/LP variants in the 109 genes. Herein, we report 120 participants (0.6%) with P/LP arrhythmia variants. Compared with noncarriers, arrhythmia P/LP carriers had a significantly higher burden of arrhythmia phenotypes in their electronic health records. Fifty-four participants had variant results returned. Nineteen of these 54 participants had inherited arrhythmia syndrome diagnoses (primarily long-QT syndrome), and 12 of these 19 diagnoses were made only after variant results were returned (0.05%). After in vitro functional evaluation of 50 variants of uncertain significance, we reclassified 11 variants: 3 to likely benign and 8 to P/LP. CONCLUSIONS: Genome sequencing in a large population without indication for arrhythmia genetic testing identified phenotype-positive carriers of variants in congenital arrhythmia syndrome disease genes. 
As the genomes of large numbers of people are sequenced, the disease risk from rare variants in arrhythmia genes can be assessed by integrating genomic screening, electronic health record phenotypes, and in vitro functional studies. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT03394859.


Subjects
Arrhythmias, Cardiac; Genetic Testing; Arrhythmias, Cardiac/diagnosis; Arrhythmias, Cardiac/genetics; Genetic Predisposition to Disease; Genetic Testing/methods; Genomics; HEK293 Cells; Humans; Phenotype; Prospective Studies
16.
JAMIA Open ; 4(4): ooab094, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34926996

ABSTRACT

OBJECTIVE: The objective of this study is to create a repository of computable, technology-agnostic phenotype definitions for the purposes of analysis and automatic cohort identification. MATERIALS AND METHODS: We selected phenotype definitions from PheKB and excluded definitions that did not use structured data or were not used in published research. We translated these definitions into the Clinical Quality Language (CQL) and Fast Healthcare Interoperability Resources (FHIR) and validated them using code review and automated tests. RESULTS: A total of 33 phenotype definitions met our inclusion criteria. We developed 40 CQL libraries, 231 value sets, and 347 test cases. To support these test cases, a total of 1624 FHIR resources were created as test data. DISCUSSION AND CONCLUSION: Although a number of challenges were encountered while translating the phenotypes into structured form, such as requiring specialized knowledge, or imprecise, ambiguous, and conflicting language, we have created a repository and a development environment that can be used for future research on computable phenotypes.

17.
Gigascience ; 10(9)2021 09 11.
Article in English | MEDLINE | ID: mdl-34508578

ABSTRACT

BACKGROUND: High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. METHODS: A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. RESULTS: We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. CONCLUSIONS: There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains.


Subjects
Electronic Health Records; Humans; Phenotype; Reproducibility of Results
18.
AMIA Jt Summits Transl Sci Proc ; 2021: 142-151, 2021.
Article in English | MEDLINE | ID: mdl-34457128

ABSTRACT

Phenotyping is an effective way to identify cohorts of patients with particular characteristics within a population. In order to enhance the portability of a phenotype definition across institutions, it is often defined abstractly, with implementers expected to realise the phenotype computationally before executing it against a dataset. However, unclear definitions, with little information about how best to implement the definition in practice, hinder this process. To address this issue, we propose a new multi-layer, workflow-based model for defining phenotypes, and a novel authoring architecture, Phenoflow, that supports the development of these structured definitions and their realisation as computable phenotypes. To evaluate our model, we determine its impact on the portability of both code-based (COVID-19) and logic-based (diabetes) definitions, in the context of key datasets, including 26,406 patients at Northwestern University. Our approach is shown to ensure the portability of phenotype definitions and thus contributes to the transparency of resulting studies.


Subjects
COVID-19; Electronic Health Records; Algorithms; Humans; Phenotype; SARS-CoV-2; Workflow
19.
AMIA Jt Summits Transl Sci Proc ; 2021: 624-633, 2021.
Article in English | MEDLINE | ID: mdl-34457178

ABSTRACT

Lack of standardized representation of natural language processing (NLP) components in phenotyping algorithms hinders portability of the phenotyping algorithms and their execution in a high-throughput and reproducible manner. The objective of the study is to develop and evaluate a standard-driven approach, CQL4NLP, that integrates a collection of NLP extensions represented in the HL7 Fast Healthcare Interoperability Resources (FHIR) standard into the Clinical Quality Language (CQL). A minimal NLP data model with 11 NLP-specific data elements was created, including six FHIR NLP extensions. All 11 data elements were identified from their usage in real-world phenotyping algorithms. An NLP ruleset generation mechanism was integrated into the NLP2FHIR pipeline and the NLP rulesets enabled comparable performance for a case study with the identification of obesity comorbidities. The NLP ruleset generation mechanism created a reproducible process for defining the NLP components of a phenotyping algorithm and its execution.


Subjects
Electronic Health Records; Natural Language Processing; Algorithms; Comorbidity; Humans; Language
20.
medRxiv ; 2021 Jul 13.
Article in English | MEDLINE | ID: mdl-34282422

ABSTRACT

Genomic regions have been associated with COVID-19 susceptibility and outcomes, including the chr12q24.13 locus encoding antiviral proteins OAS1-3. Here, we report genetic, functional, and clinical insights into genetic associations within this locus. In Europeans, the risk of hospitalized vs. non-hospitalized COVID-19 was associated with a single 19Kb-haplotype comprised of 76 OAS1 variants included in a 95% credible set within a large genomic fragment introgressed from Neandertals. The risk haplotype was also associated with impaired spontaneous but not treatment-induced SARS-CoV-2 clearance in a clinical trial with pegIFN-λ1. We demonstrate that two exonic variants, rs10774671 and rs1131454, affect splicing and nonsense-mediated decay of OAS1. We suggest that genetically-regulated loss of OAS1 expression contributes to impaired spontaneous clearance of SARS-CoV-2 and elevated risk of hospitalization for COVID-19. Our results provide the rationale for further clinical studies using interferons to compensate for impaired spontaneous SARS-CoV-2 clearance, particularly in carriers of the OAS1 risk haplotypes.
